The phylogeographical pattern of the Amur minnow Rhynchocypris lagowskii (Cypriniformes: Cyprinidae) in the Qinling Mountains

Abstract In this study, the phylogeographical pattern of the Amur minnow (Rhynchocypris lagowskii) widely distributed in the cold freshwaters of the Qinling Mountains was examined. A total of 464 specimens from 48 localities were sequenced at a 540‐bp region of the mitochondrial cytochrome b (Cytb) gene, and 69 haplotypes were obtained. The mean ratio of the number of synonymous and nonsynonymous substitutions per site (dN/dS) was 0.028 and indicated purifying selection. Haplotype diversity (h) and nucleotide diversity (π) of natural populations of R. lagowskii varied widely between distinct localities. Phylogenetic trees based on Bayesian inference (BI), maximum likelihood (ML), and maximum parsimony (MP) methods, and network analysis showed five well‐differentiated lineages, but these did not completely correspond to localities and geographic distribution. Meanwhile, analysis of molecular variances (AMOVA) indicated the highest proportion of genetic variation was attributed to the differentiation between populations rather than by our defined lineages. In addition, there was no significant correlation between the pairwise Fst values and geographic distance (p > .05). Based on the molecular clock calibration, the time to the most recent common ancestor (TMRCA) was estimated to have emerged from the Late Miocene to the Early Pleistocene. Finally, the results of demographic history based on the neutrality test, mismatch distribution, and Bayesian skyline plot (BSP) analyses showed that collectively, the populations were stable during the Pleistocene while one lineage (lineage E) probably underwent a slight contraction during the Middle Pleistocene and a rapid expansion from the Middle to the Late Pleistocene. Therefore, the study suggests the current phylogeographical pattern of R. lagowskii was likely shaped by geological events that led to vicariance followed by dispersal and secondary contact, river capture, and climatic oscillation during the Late Miocene to the Early Pleistocene in the Qinling Mountains.


| INTRODUC TI ON
Phylogeography is the study of historical processes that may be responsible for the contemporary geographic distributions of genealogical lineages within and among closely related species and is primarily conducted using molecular markers (Avise, 2000;Bowen et al., 2016;Chen et al., 2020;Hardouin et al., 2018;Li et al., 2011;Schneider et al., 1998;Wang et al., 2012;Wu et al., 2009;Yu et al., 2014). It can be used to identify different historical forces, such as population expansion, bottlenecks, climate oscillation, vicariance, and migration, analyze the variation in population distributions, and reconstruct the evolutionary processes of fauna and flora (Huang, 2012). Animal mitochondrial DNA (mtDNA) had been widely used in phylogenetics for systematic, population genetics, phylogeography, and comparative phylogeography (Avise et al., 1987;Bowen et al., 2016;Chen et al., 2020;Hardouin et al., 2018;Yu et al., 2014) and was employed in this study due to its maternal inheritance, rapid mutation rate, and low level of intermolecular genetic recombination (Brown et al., 1979;Clayton, 1982;Giles et al., 1980). The Qinling Mountains represent a natural boundary between the northern and southern regions of the country and separate the Chinese temperate and subtropical climatic zones (Ding et al., 2013), resulting in differentiated terrestrial and freshwater fauna (Li, 1981;Zhang, 2011). Meanwhile, the rapid uplift of these mountains and climatic oscillation were influenced by the Qinghai-Tibet Plateau movement from the Miocene to the Pleistocene and have played important roles in influencing the phylogeographical patterns of a variety of organisms, including parasite, amphibian, fish, and mammal species (Chen et al., 2020;Hardouin et al., 2018;He et al., 1992;Hu et al., 2021;Huang et al., 2017;Li et al., 2021;Liu et al., 2015;Meng et al., 2014;Shao et al., 2019;Shi, 2002;Wang et al., 2012Wang et al., , 2013Yu et al., 2014;Zhang & Fang, 2012).
The Amur minnow (Rhynchocypris lagowskii) ( Figure 1) is a small cyprinid widely distributed in cold freshwater from the Lena and the Amur Rivers southward to the Yangtze drainage in East Asia (Bogutskaya et al., 2008;Chen, 1998;Min & Yang, 1986). R. lagowskii and related fish species were selected for phylogeographic studies because their specific ecological upstream distribution resulted in much smaller population sizes, low dispersal ability, and restricted gene flow (Hassan et al., 2015;Higuchi & Watanabe, 2005;Kang et al., 2000;Min & Yang, 1986;Nishida et al., 2015;Sakai et al., 2014;Xue et al., 2017;Yu et al., 2014;Zhang & Chen, 1997). The divergence times of other parasite and fish species in this region were estimated to have occurred during the Early to the Middle Pleistocene and followed the rapid uplift of the Qinling Mountains (Chen et al., 2020;Hardouin et al., 2018;Yu et al., 2014;Zhang & Fang, 2012). However, the time to the most recent common ancestor (TMRCA) of the congeneric species Rhynchocypris oxycephalus, Rhynchocypris percnurus, and R. lagowaskii was estimated to be in the Late Miocene. As a consequence, the phylogeographical pattern of R. oxycephalus was shaped by the geological events and Pliocene climate fluctuations, but this study only used four individuals of R. lagowaskii as the outgroup taxa (Yu et al., 2014).
Unfortunately, there have been limited studies and insufficient sampling of R. lagowskii in the Qinling Mountains (Yu et al., 2014), so it has been difficult to evaluate the phylogeographical pattern of R. lagowskii in this region. Therefore, the purpose of the study is to use a larger number of specimens and cover a wide geographical range in the Qinling Mountains to illustrate the phylogeographical pattern of the species. For this work, we explored the phylogeographical pattern of R. lagowskii, based on the cytochrome b (Cytb) gene sequences of 464 specimens from 48 geographical localities in the main Qinling Mountains. In addition, the study also assessed genetic differentiation between populations, demographic history, and the effects of geological events or climate oscillation during the Pleistocene.

| Ethics statement
The study was approved by the Animal Care and Use Committee of Shaanxi Normal University. The species is not evaluated in the IUCN red list status (https://www.iucnr edlist.org). None of the species sampled are endangered or protected in China (Yue & Chen, 1998).
Fish sampling is permitted by the local level authority in scientific research.

| Sample collection
Specimens of R. lagowskii were sampled from 48 localities, which covered most regions of the Qinling Mountains, from May to October in 2016 and 2017 ( Figure 2 and Table 1). The fish were rapidly euthanized by a blow to the head and stored in 96% ethanol

T A X O N O M Y C L A S S I F I C A T I O N
Biogeography F I G U R E 1 Amur minnow, Rhynchocypris lagowskii within three minutes. Subsequently, the specimens were examined microscopically in the laboratory, and species identification was performed based on the morphological characteristics (Chen, 1998

| DNA extraction, PCR amplification, and direct sequencing
Total genomic DNA of R. lagowskii was extracted from each specimen following the operation instruction of the TIANamp Marine Animals DNA Kit (Tiangen Biotech, Beijing, China). The forward primer Cytb-F (5'-ATGGCAAGCCTACGAAAAAC-3') and the reverse primer Cytb-R (5'-GATTACAAGACCGATGCTTT-3') designed based on the same species (Zhao et al., 2016) were used to amplify a 540-bp fragment of the mitochondrial cytochrome b (Cytb) gene by polymerase chain reaction (PCR) for each specimen. PCR amplification was performed in a total volume of 25 µL, containing 3 mM MgCl 2 , 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 0.25 mM of each dNTP, 1.25 U rTaq polymerase (TaKaRa, Dalian, China), 0.4 μM of each primer, 45 ng gDNA, tapped with Milli-Q water. The following cycling conditions were applied: initial denaturation for 1 min at 93°C followed by 35 cycles of denaturation for 10 s at 92°C, annealing for 1.5 min at 51°C, and extension for 2 min at 60°C with a final extension for 6 min at 72°C (Chen et al., 2020). All fragments were initially purified with a PCR purification kit (BGI Biotech, Shenzhen, China), subsequently subjected to electrophoresis in a 1% agarose gel, and finally sequenced with the forward primer using an ABI Prism®3730 automated sequencer (Applied Biosystems, Foster City, USA).

| Phylogenetic and network analyses
Before phylogenetic analysis, a substitution model for the haplotype dataset was determined using the Bayesian information criterion (BIC) in jModelTest v2.2.10 ( Darriba et al., 2012). As a result, the TrN model (Tamura & Nei, 1993) of molecular evolution with the gamma shape parameter (TrN+G) was selected. Subsequently, phylogenetic trees based on the mitochondrial Cytb haplotypes were reconstructed using the Bayesian inference (BI), maximum likelihood (ML), and maximum parsimony (MP) methods. The congeneric species R. percnurus was selected as an outgroup (Imoto et al., 2013). Maximum parsimony analysis was implemented in PAUP* v4.0b10a (Swofford, 2002). Heuristic searches with tree-bisectionreconnection were executed for 1000 random addition replicates with all characters treated as unordered and equally weighted.
Maximum likelihood analysis was conducted using RAxML v7.2.8 (Stamatakis et al., 2005), with bootstrap analysis performed with 1,000 replicates. Bayesian inference analysis was performed using MrBayes v3.1.2 (Ronquist & Huelsenbeck, 2003), and one set of four chains was allowed to run simultaneously for 15 million generations. The trees were sampled every 1000 generations, with the first 25% being discarded as burn-in. Stationarity means that the log-likelihood kept a stable level with the sampled generations increasing, and it was considered to be reached when the average standard deviation of split frequencies was below 0.01. A medianjoining haplotype network was then constructed using PopART v1.7 (Leigh & Bryant, 2015).

| Population genetic structure
The mean genetic distances among the lineages identified in our phylogenetics analysis (see results) were calculated by an uncorrected p-distance model using MEGA v6.06 (Tamura et al., 2013).
Subsequently, the analysis of molecular variances (AMOVA) was performed to investigate the level of genetic variation between populations using pairwise differences F-statistics in Arlequin v3.5.1.2 (Excoffier & Lischer, 2010). Finally, the correlation between the pairwise Fst values of the individuals from localities and their geographic distances (km) was analyzed to test for isolation by distance (IBD) (Slatkin, 1993) and assessed using linear regression in GraphPad Prism v 5.0 (www.graph pad.com).

| Divergence time estimation
For divergence time estimation among the lineages identified from the well-supported phylogenetic clades, a coalescent time estimation method was used in BEAST v1.6.1 (Drummond & Rambaut, 2007). The divergence times of each lineage were estimated using the TN93+G as a site model, an uncorrelated lognormal relaxed molecular clock model, and a birth-death speciation tree prior (Ritchie et al., 2017). The mutation rate of 1% per million years ago (Mya) was adopted based on the phylogeography studies with the Cytb gene in cyprinid fish (Durand et al., 2002). Bayesian Markov Chain Monte Carlo (MCMC) analyses were performed for 15 million generations while sampling every 5,000 th tree, and the first 10% of the trees sampled were treated as burn-in. Subsequently, the estimates and convergence of effective sample size (ESS) for all parameters larger than 200 were checked with Tracer v1.7.1 (Rambaut et al., 2018), and all resulting trees were combined with LogCombiner v1.7.3 (Drummond & Rambaut, 2007). Finally, a maximum credibility tree was produced using TreeAnnotator v1.5.3 (Drummond & Rambaut, 2007), visualized, and annotated in FigTree v1.4.2 (Rambaut, 2014).

| Demographic history
Three methods were used with the haplotype dataset to trace the demographic history. Firstly, the neutrality test between the values of Tajima's D (Tajima, 1989) and Fu's Fs (Fu & Li, 1993) was used to test for neutral evolution in Arlequin v3.5.1.2. Subsequently, the mismatch distribution between the values of the sum of squared deviations (SSD) and Harpending's raggedness index (HRI) was used to test for signals of demographic expansion using Arlequin v3.5.1.2 (Harpending, 1994). Meanwhile, the beginning time of expansion (t) was calculated following a formula (t = tau/4uk, the value of tau is expansion parameter, generated by mismatch distribution with Arlequin v3.5.1.2, the value of u is the mutation rate per nucleotide, and the value of k is the length of nucleotide sequences) (Rogers & Harpending, 1992). Finally, the Bayesian skyline plot (BSP) analysis was performed with strict clock estimation using the TN93+G substitution model with a mutation rate of 1% per Mya and 15 million generations to describe demographic history by assessing the variation between time and ESS in BEAST v1.6.1 and Tracer v1.7.1.   The haplotype diversity (h) and nucleotide diversity (π) across the samples were obtained (Table 1). The highest values of h and π were analyzed in the two localities (population code: BD and XYG) samples, respectively, while the lowest values of h and π were obtained in four localities samples (population code: GJZ, HZZ, LYG, and SZB).

| Genetic diversity
In addition, the most common haplotype (Hap10) was distributed in ten localities and were accounted for in 18.5% of the individuals (86 of 464), including 75 samples in the Han River, a sample in the Wei River, and ten samples in the Jialing River, respectively. Interestingly, the relationship between the genetic diversity and the population latitude and longitude was uncorrelated.

| Phylogenetic and network analyses
The phylogenetic analysis found that the identified haplotypes

| Population genetic structure
The mean genetic distance among our defined lineages ranged from 3.5% to 13.9% in the Qinling Mountains ( Table 2). The lowest value was found between the lineages C and E samples, whereas the largest value occurred between the lineages C and D samples.
Subsequently, the analysis of molecular variances (AMOVA) indicated that the highest proportion of genetic variation (45.5%) was attributed to the differentiation between populations, whereas the lowest proportion of genetic variation (10.5%) was attributed to the differentiation among our defined lineages (A-E). Meanwhile, all the variance values of fixation indices were significant, showing that most of the genetic variation was a partition between populations rather than by among our defined lineages in the Qinling Mountains (Table 3). Furthermore, there was no significant correlation between the pairwise Fst values of the individuals from localities and their geographic distances (R 2 = 0.00003154, p = .852) ( Figure 5), rejecting the IBD model.

| Divergence time estimation
Coalescent techniques were applied to estimate the time to the most recent common ancestor (TMRCA) and divergence times of the five lineages. TMRCA of the whole ingroup dated to 7.03 Mya.
The divergence times among lineages diverged from 5.63 Mya to 2.38 Mya (Figure 3). All divergence times occurred during the Late Miocene to the Early Pleistocene.

| Demographic history
The neutrality test showed population stability except for lineage E (Table 4) (Li, 1981;Zhang, 2011). Previous studies have shown that these mountains have played an important role as a geographical barrier in shaping the significant phylogeographic patterns of many species with a low dispersal ability, including amphibian, fish, and mammal species and may have led to the fragmentation of populations by vicariance (Hardouin et al., 2018;Huang et al., 2017;Li et al., 2021;Liu et al., 2015;Meng et al., 2014;Shao et al., 2019;Wang et al., 2012Wang et al., , 2013Yu et al., 2014).
Subsequently, our study found that the phylogenetic trees, network, and AMOVA yielded well-differentiated lineages. A clear phylogeographic pattern was observed with some shared haplotypes of R. lagowskii across geographic locations. The current pattern was likely shaped by lower dispersal ability, wide sampling, and past vicariance (genetic isolation among adjacent areas by natural barriers) followed by a dispersal and secondary contact. The mean genetic distance among haplotypes of R. lagowskii ranged from 3.5% to 13.9% (overall mean distance, 6.5%). This was higher than that found in amphibians S. ningshanensis (2.4% to 4.2%) and B. tibetanus (0.0% to 3.7%) but similar to values found in other fish and amphibian species, including R. oxycephalus (6.5% to 7.4%) and lagowskii and this lacked IBD. This is similar to results from some other amphibian and mussel species Liu et al., 2017;Wang et al., 2013). Our study showed the current phylogeographical pattern of R. lagowskii was likely affected by past vicariance (genetic isolation among adjacent areas by natural barriers) followed by dispersal and secondary contact.
Therefore, based on phylogenetic trees, network, mean genetic distance, AMOVA, and IBD analyses of R. lagowskii, we found that as R. lagowskii is primarily found in clear cold freshwater from the midstream to upstream in East Asia (Kang et al., 2000;Nishida et al., 2015;Zhang & Chen, 1997), the specific ecological upstream distribution results in much smaller population size and higher differentiation between populations (Yu et al., 2014). In addition, the rapid uplift of the Qinling Mountains during the Miocene to the Holocene, especially starting at the Early Pleistocene modified the topography of this area, potentially creating more isolated geographic pockets of suitable habitat (Zhang & Fang, 2012), and river capture related to the new tectonic movements also occurred and accelerated fish dispersal based on some well-supported phylogenetic lineages and the values of genetic diversity between the samples from watershed localities in this region (Zhang, 1962). So it is stated that vicariance and uplift promote differentiation but river capture promotes dispersal (Albert & Crampton, 2010;Albert et al., 2017;Zhang & Chen, 1997;Zhang & Fang, 2012). Also, upstream barriers might limit gene flow (Blasco-Costa et al., 2012).

| Genetic diversity change during the glaciation and purifying selection
The R. lagowskii showed high levels of genetic diversity with many unique haplotypes and with some haplotypes being found across wide geographic regions indicating limited gene flow.
However, there were three unique haplotypes and four shared haplotypes in ten of the sampled localities and these populations had extremely low levels of genetic diversity. This is different from

Geographic distance (km)
Fst the results that found eleven unique haplotypes and one shared haplotype in twelve localities populations of R. oxycephalus (Yu et al., 2014). The genetic diversity loss across these low diversity localities is likely the result of genetic drift, bottleneck, founder effect, habitat fragmentation, inbreeding, and gene flow deficiency (Hunter & Gibbs, 2006 (Yu et al., 2014).
Some of the genetic patterns observed in R. lagowskii may be the results of various refugia that were available during the LGM. During this period, ice sheet or glaciation extended south to the high latitude and altitude regions in Europe and North America at the LGM (Hewitt, 1996). The fish and vertebrate species expanded mainly from southern refugia in more recent interglacials, reducing genetic diversity, and forming distinct phylogeographical patterns after the LGM (Hewitt, 1996;Rowe et al., 2004). The high altitude region in the Qinling Mountains was also affected by Taibai glaciation (0.019 Ma) (Shi, 2002); however, the genetic diversity of R. lagowskii did not show the significant latitude and altitude decreasing trend often observed with other mussel, fish, amphibian, and mammal species in the Qinling Mountains and other regions of China (Hardouin et al., 2018;Li et al., 2021;Liu et al., 2015Liu et al., , 2017Shao et al., 2019;Wang et al., 2012;Xue et al., 2017;Yu et al., 2014 (Kimura, 1977;Yang & Bielawski, 2000). Our study also showed the mean dN/dS value was 0.028 for Cytb. Past work evaluating neural crest-associated genes and mitochondrial proteincoding genes in fish have suggested that when the dN/dS value is below 0.1 this suggest the genes are under strong purifying selection and slow evolution, safeguarding the biological function of proteins against deleterious mutations (Kratochwil et al., 2015;Lu et al., 2019;Rand & Kann, 1998). Therefore, the populations of R. lagowskii probably underwent purifying selection with slow evolution and rapid adaptation with an incomplete mitochondrial gene and a relatively low value of nucleotide diversity (π) against nonsynonymous substitutions often observed with 13 mitochondrial oxidative phosphorylation (OXPHOS) genes among other cyprinid species (Lu et al., 2019). More genetic data would be needed to test this conclusion in further study.

| Population divergence and demographic history
The divergence time among all lineages of R. lagowskii was during the Late Miocene to the Early Pleistocene and is similar to results from some amphibian and fish species in this region (Hardouin et al., 2018;Huang et al., 2017;Yu et al., 2014), representing an ancient separation. Previous studies showed the phylogeographic patterns of some amphibian and fish species were profoundly influenced by climate oscillations and tectonic barriers with the rapid uplift of the Qinling Mountains during this period (Gao et al., 2012;Meng et al., 2014;Wang et al., 2013;Yu et al., 2014).
The neutrality test, except lineage E, showed population stability in this region with other amphibian, fish, and mammal species (Hu et al., 2021;Meng et al., 2014;Yu et al., 2014). However, the mismatch distribution, except lineage E, did not reject the hypothesis of sudden expansion. In addition, lineage E was unimodal and showed expansion, whereas others were stable. The beginning time of expansion (t), except lineage E, was during the Early to the Late Pleistocene before the LGM following the rapid uplift of these mountains (Zhang & Fang, 2012 to what has been found in other amphibians and fish in this region (He et al., 1992;Huang, 1982;Wang et al., 2013;Yu et al., 2014 (Yu et al., 2014).

| CON CLUS IONS
Our studies suggest the current phylogeographical pattern of R.

CO N FLI C T O F I NTE R E S T
The authors declare that they have no competing interests.